Abstract. The aim of this study is to optimize CALL environments as a learning tool
rather than a gloss, focusing on the learning of polysemous words that refer to spatial
relationships between objects. Much research has already examined the efficacy of visual
glosses while reading L2 texts and has reported that visual glosses can be effective for
incidental vocabulary learning. This study, however, discusses the efficacy of visual aids
for vocabulary learning from three different standpoints. First, previous studies have not
addressed the meanings of the target words and have concluded that the aids are effective
regardless of part of speech, even though some words are easy to learn and others are
difficult depending on the part of speech. More attention to meaning structures is
therefore necessary. Second, previous studies have treated visual aids in CALL-based
vocabulary learning as glosses consulted while reading texts. As CALL environments have
developed, however, glosses are now used not merely as a reference tool but as a learning
tool. Finally, much research on vocabulary learning with multimedia has been conducted
in reading activities, although visual aids can also be effective in other activities, such
as listening, in which deeper discourse comprehension is required. Taking these
standpoints into consideration, we hypothesize that the intentional learning of such words
with multimedia-oriented visual aids could enhance not only comprehension of vocabulary
meanings but also comprehension of a script that includes those words. To examine this
hypothesis, we conducted an experimental study with computer-mediated learning material
for English prepositions, which we developed for this study. The findings can contribute
to better CALL environments, leading to more effective web-based or mobile-based
learning tools.
1. Introduction

The efficacy of pictures or images as visual glosses has been discussed in L2 vocabulary
learning and CALL. Most of the studies related to this issue focus on their efficacy
for long-term retention through incidental learning, such as learning while reading (for
example, Al-Seghayer, 2001; Chun & Plass, 1996; Lomicka, 1998; Yoshii & Flaitz, 2002).
With the development of multimodal CALL, however, an electronic dictionary, as a
collection of glosses, has come to be used not only as a reference tool but also as a
learning tool (Pachler, 2001), for example to acquire certain vocabulary items
intentionally. In that sense, it is worth reexamining the conditions under which glosses
are best used in intentional L2 vocabulary learning, which, we believe, will lead to the
development of CALL materials. We therefore address this issue by comparing planar still
images and animated stereo ones, both of which depict conceptual schemas of L2 spatial
prepositions.

2. Background 

As mentioned above, many studies of L2 vocabulary learning with visual glosses have
been conducted, and they have recognized the positive impact of such glosses on L2
vocabulary learning. However, we have concerns about the previous studies on the
following points (Sato & Suzuki, 2010, 2011). First, they tend to examine L2 vocabulary
learning as incidental learning. As a result, little discussion has been devoted to the
relationship between particular types of glosses and vocabulary. In addition, they tend
to regard longer retention of words as the goal of successful learning.

In this study, by contrast, we take the position that L2 vocabulary should be learned
intentionally, because some types of words are easier to learn and others harder in
terms of their meanings. Given our claim that comprehension of meanings constitutes
successful L2 vocabulary learning, we re-evaluate the effectiveness of pictures or
images as visual glosses.

3. Our study 

Our study therefore focuses on three things: prepositions, schematic images as visual
glosses, and the comparison between pictorial and live-action images. We focus on English
spatial prepositions because learning them is regarded as important but difficult:
prepositions appear very frequently in any discourse, but learners do not always
understand their meanings (Lindstromberg, 1996). Learners might learn prepositions as
idioms or chunks, but they cannot use them appropriately in context through memorization
alone (Lindstromberg, 2001). In addition, translating each sense of a word may confuse
learners because the connections among the senses become unclear (Tanaka, 1990). These
problems, which L2 learners encounter in learning L2 prepositions, show that more focus
is needed on meanings than on the retention of vocabulary.

We then focus on the image schema as a visual gloss for learning English prepositions.
Johnson (1987) defines “image schemata [as] abstract patterns in our experience and
understanding that are not propositional” (p. 2), which can serve as mediators
connecting the senses of a word. An image schema reflects the prototypical sense of the
word, but it can be extended to other senses. As a result, the image can cover all the
senses. This means that if L2 learners understand the image schema as a medium linking
each sense of the word, they could differentiate the senses of other prepositions.

Finally, we compare planar still images with animated stereo images, because the two
types are theoretically supported by different frameworks: CALL and cognitive
linguistics. CALL research supports the effectiveness of animation, as seen in
Al-Seghayer’s (2001) research. In cognitive linguistics, on the other hand, from which
the image schema is derived, schematic images are held to be flexible and changeable
through operations such as foregrounding, rotation and focusing (Langacker, 1987). This
implies that simple images are better, while live-motion images might prevent learners
from modifying the images because of their fixed configuration.

Our research question is therefore whether planar still images or animated stereo ones
serve as a better facilitator for learning the meanings of English prepositions. We
explain the details of our experimental research in the next section.

4. Research 

4.1. Procedures 

Fifty-two Japanese university students, ranging from freshmen to postgraduates, took
part in our research. They were from the university where the first author works and
were not majoring in English. They were randomly divided into two groups: a control
group and an experimental group. First, both groups answered multiple-choice questions
about the senses of eight spatial prepositions: above, across, along, below, in, into,
on and over. The test consisted of forty-five questions, and no feedback was given after
the test. Each group then studied the senses of the prepositions for ten minutes using
one of the two kinds of dictionaries we provided. They then took the post-test, which
consisted of the same questions as the pre-test in randomized order. The data were
analysed using ANOVA with between-subjects and within-subjects variables.
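
The analysis described above, with image type as the between-subjects factor and test as
the within-subjects factor, can be sketched in a few lines of Python. This is only an
illustrative sketch under stated assumptions, not the software we used: the function name
mixed_anova, the input layout (one pre-test and post-test score pair per participant),
and the requirement of equal group sizes are all assumptions made for the example.

```python
from statistics import mean


def mixed_anova(groups):
    """Two-way mixed ANOVA: groups (between) x two tests (within).

    `groups` maps a group label to a list of (pre, post) score pairs,
    one pair per participant. Hypothetical helper; equal group sizes
    are assumed so that the sums of squares decompose cleanly.
    """
    sizes = {len(pairs) for pairs in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    n = sizes.pop()                      # participants per group
    a = len(groups)                      # number of groups
    n_subj = a * n                       # participants in total
    subjects = [pair for pairs in groups.values() for pair in pairs]
    scores = [x for pair in subjects for x in pair]
    grand = mean(scores)

    ss_total = sum((x - grand) ** 2 for x in scores)
    # Between-subjects part: each participant contributes two scores.
    ss_between = 2 * sum((mean(pair) - grand) ** 2 for pair in subjects)
    ss_a = 2 * n * sum(
        (mean([x for pair in pairs for x in pair]) - grand) ** 2
        for pairs in groups.values()
    )
    ss_subj = ss_between - ss_a          # participants within groups
    # Within-subjects part: test effect, interaction, and residual.
    ss_b = n_subj * sum(
        (mean([pair[j] for pair in subjects]) - grand) ** 2 for j in (0, 1)
    )
    ss_cells = n * sum(
        (mean([pair[j] for pair in pairs]) - grand) ** 2
        for pairs in groups.values()
        for j in (0, 1)
    )
    ss_axb = ss_cells - ss_a - ss_b
    ss_sxb = ss_total - ss_between - ss_b - ss_axb

    df_a, df_subj = a - 1, n_subj - a    # error df is the same for both parts
    ms_subj = ss_subj / df_subj
    ms_sxb = ss_sxb / df_subj
    return {
        "A": {"ss": ss_a, "df": df_a, "f": (ss_a / df_a) / ms_subj},
        "subj": {"ss": ss_subj, "df": df_subj},
        "B": {"ss": ss_b, "df": 1, "f": ss_b / ms_sxb},
        "AxB": {"ss": ss_axb, "df": df_a, "f": (ss_axb / df_a) / ms_sxb},
        "sxB": {"ss": ss_sxb, "df": df_subj},
    }
```

The image effect (A) is tested against the variation among participants within groups,
whereas the test effect (B) and the interaction (AxB) are tested against the
participant-by-test residual, which is why the two parts use different error terms.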

4.2. Findings 

The analysis shows that participants in both groups scored higher on the post-test than
on the pre-test. The ANOVA found no significant main effect of image type but a
significant main effect of test (Images: F(1,50) = .018, p > .05; Tests: F(1,50) =
112.5, p < .05). That is, there is no statistically significant difference between the
two groups. However, a significant difference was found between the pre-test and the
post-test after the treatments, while no interaction between the two factors was found
(Figure 1). These results may indicate that visual glosses can facilitate intentional
learning of the senses of English prepositions whether they are planar still images or
animated stereo ones, which differs from the results of many studies on L2 vocabulary
learning with visual glosses.

Figure 1. The result of ANOVA analysis

A = Image
B = Test

s.v      SS           df    MS          F
A           0.4712      1     0.4712      0.02 ns
subj     1343.2500     50    26.8650
B         706.1635      1   706.1635    112.49*
AxB         0.4712      1     0.4712      0.08 ns
sxB       313.8654     50     6.2773
Total    2364.2212    103

+p<.10 *p<.05 **p<.01

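
As a quick arithmetic check, the F ratios in Figure 1 can be recomputed from the reported
sums of squares and degrees of freedom. The short Python sketch below uses only the
figures from the table; the names table, mean_square and f_ratio are hypothetical helpers
introduced for illustration.

```python
# Sums of squares and degrees of freedom as reported in Figure 1.
table = {
    "A": (0.4712, 1),        # image type (between subjects)
    "subj": (1343.2500, 50), # participants within groups (error for A)
    "B": (706.1635, 1),      # test: pre vs post (within subjects)
    "AxB": (0.4712, 1),      # image x test interaction
    "sxB": (313.8654, 50),   # participant x test residual (error for B, AxB)
}


def mean_square(row):
    """Mean square = sum of squares divided by degrees of freedom."""
    ss, df = table[row]
    return ss / df


def f_ratio(effect, error):
    """F = mean square of the effect over mean square of its error term."""
    return mean_square(effect) / mean_square(error)


f_image = f_ratio("A", "subj")
f_test = f_ratio("B", "sxB")
f_interaction = f_ratio("AxB", "sxB")
total_ss = sum(ss for ss, _ in table.values())
total_df = sum(df for _, df in table.values())

if __name__ == "__main__":
    print(f"Image: F(1,50) = {f_image:.3f}")
    print(f"Test: F(1,50) = {f_test:.2f}")
    print(f"Image x Test: F(1,50) = {f_interaction:.2f}")
```

The recomputed ratios agree with the table to the reported precision, and the component
sums of squares and degrees of freedom add up to the Total row within rounding.
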
5. Discussion and conclusion 

In conclusion, our experimental research shows that images used as visual glosses can
be good facilitators of learning L2 prepositions regardless of their configuration. We
have to admit that further analysis, such as a delayed test, is needed, but our previous
studies also show the same result. For L2 vocabulary learning in this setting,
technologically advanced visual glosses do not always bring about a better result. Of
course, this result does not mean that using technology or multimedia functions is
pointless; our point is that glosses should be optimized according to the target. We
have to think about the conditions that make learning successful. Either way, further
research is needed.